Abstract - We examine a global disorder transition when identifying community structure in an 
arbitrary complex network. Earlier, we illustrated [Phil. Mag. 92, 406 (2012)] that "community 
detection" (CD) generally exhibits disordered (or unsolvable) and ordered (solvable) phases of 
both high and low computational complexity along with corresponding transitions from regular to 
chaotic dynamics in derived systems. Using an exact generalized dimensional reduction inequality, 
multivariate Tutte polynomials, and other considerations, we illustrate how increasing the number 
of communities q emulates increasing the heat bath temperature T for a general weighted Potts 
model, leading to global disorder in the community structure of arbitrary large graphs. Dimen- 
sional reduction bounds lead to results similar to those suggested by mean-field type approaches. 
Large systems tend toward global insolvability in the limit of large q above a crossover temperature 
T x w L\ J e |/ [AT In 5] where | J e \ is a typical interaction strength, L is the number of edges, and N 
is the number of nodes. For practical system sizes, a solvable phase is generally accessible at low 
T. The global nature of the disorder transition does not preclude solutions by local CD algorithms 
(even those that employ global cost function parameters) as long as community evaluations are 
locally determined. 
Introduction. — Methods of statistical physics have enlarged the understanding of complex 
networks [1]. In particular, community detection (CD) [2] attempts to identify "mesoscopic" 
structure within these systems. Applications of CD are extremely broad, and numerous methods 
have been leveraged to solve it [3]-[14]. The problem complexity and related aspects of 
community "detectability" were studied for an "absolute Potts model" (APM) [15],[16], modularity, 
and mean-field type (cavity) approaches [19],[20], where the latter references [18]-[20] 
examined general cluster detectability transitions in a special class of stochastic block 
models. Ref. [21] reviewed critical phenomena in complex networks. 

We illustrated [IB] that the APM, along with other CD 
approaches, exhibits solvable phases of "easy" or "hard" 
complexity and unsolvable phases with spin-glass-type or 
other transitions coinciding with transitions from ergodic 
to non-ergodic dynamics in mechanical analogs. Bashan 
et al. [22] showed that transitions in network topology 
have physiological significance, implying that network 
transitions have relevance beyond detectability/solvability 
thresholds. We also previously demonstrated [23] how distinct 
phases of the CD problem affect image segmentation 
applications. Other authors covered disorder transitions 
for random-bond Potts models [24],[25] and zeros of the 
partition function [26] in the limit of a large number of 
Potts spin flavors ($q \gg 1$). 

As depicted in fig. 1, CD attempts to partition a graph 
into $q$ optimally disjoint subgraphs (or communities). Optimal 
values of $q$ may be determined via multiscale methods 
[9],[10],[27]-[29]. Ref. [30] discussed the absence of large 
clusters in large real networks (i.e., $q$ is large). 

We investigate a general weighted Potts model on an arbitrary graph with $q \gg 1$, and we 
illustrate how increasing $q$ emulates increasing the temperature $T$. For CD, it implies that 
whenever an algorithm or cost function can be represented as a weighted Potts model, large 
systems are inherently disordered on a global level above a crossover temperature $T_\times$. 
The result encompasses a wide variety of CD methods including optimizing modularity [31], a 
Potts model applying a "configuration null model" (CMPM) [4],[32], an Erdős-Rényi Potts model 
[4],[33], a "constant Potts model" [34], "label propagation", the APM [10],[15], and others. 
We speculate that the disorder persists for systems with external magnetic interactions as 
well as directed and multipartite graphs. 

Fig. 1: (Color online) The schematic illustrates a community 
partition (distinct node shapes and colors) showing relevant 
structure in a graph with ferromagnetic (solid, black lines) and 
antiferromagnetic interactions (gray, dashed lines). Line thickness 
indicates the relative interaction strength $J_e$. Antiferromagnetic 
("adversarial" with $J_e < 0$) and non-interacting 
("neutral," unconnected with $J_e = 0$) relations break up 
well-defined communities. 

While the result applies to general Potts models on arbitrary large graphs, it only implies 
disorder on a global, as opposed to local, level for large-$q$ networks with bounded 
coordination numbers and vertex count. Certain approaches were shown to avoid a "resolution 
limit" imposed by global cost function parameters on some models [4],[31],[40]-[42], but all 
weighted (or unweighted) Potts models would be subject to the global disorder imposed by 
large $q$. This global disorder can be mitigated or avoided by solving the system locally or 
at sufficiently low $T$, but the latter condition exists in competition with the beneficial 
thermal annealing effects of increased temperature at sufficiently low $T$ ("order by 
disorder"). 
The disorder induced by large $q$ is quantitatively different 
from that caused by high noise (extraneous intercommunity 
edges) in a network [15],[16],[43]. A "glassy" 
transition due to noise may persist as $T \to 0$ because the 
solution algorithm is frustrated by a complex energy landscape 
exhibiting numerous local minima. 

Potts Hamiltonian. — We consider a general spin-glass-type Potts model Hamiltonian 

$$H(\{\sigma\}) = -\sum_{i<j} J_{ij}\,\delta(\sigma_i,\sigma_j), \tag{1}$$

where $J_{ij}$ is the interaction strength between spins $i$ and $j$, each pair of spins is 
counted once, and $\delta(\sigma_i,\sigma_j) = 1$ if $\sigma_i = \sigma_j$ and 0 otherwise. 
For CD, it is convenient to separate the ferromagnetic ($J_{ij} > 0$) and antiferromagnetic 
($J_{ij} < 0$) interactions 

$$H(\{\sigma\}) = -\sum_{i<j}\left[w_{ij}A_{ij} - u_{ij}\left(1 - A_{ij}\right)\right]\delta(\sigma_i,\sigma_j). \tag{2}$$

Given $N$ nodes, $\{A_{ij}\}$ is the adjacency matrix where $A_{ij} = 1$ if nodes $i$ and $j$ 
are connected by a ferromagnetic edge and is 0 otherwise. $w_{ij} \ge 0$ and $u_{ij} \ge 0$ 
are ferromagnetic and antiferromagnetic edge weights, respectively. Each spin $\sigma_i$ may 
assume integer values in the range $1 \le \sigma_i \le q$ where $q$ is usually dynamically 
determined. Node $i$ is a member of community $k$ when $\sigma_i = k$. 
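
As an illustration of eqs. (1) and (2), the following minimal Python sketch evaluates the energy of a given community assignment. The function names `couplings` and `potts_energy` are our own (hypothetical, not taken from any library), and the sketch assumes that each pair of nodes is counted once in the sum.

```python
def couplings(A, w, u):
    """Build J_ij = w_ij * A_ij - u_ij * (1 - A_ij), the bracket in eq. (2).

    A, w, u are square nested lists; A holds 0/1 adjacency entries.
    """
    n = len(A)
    return [[w[i][j] * A[i][j] - u[i][j] * (1 - A[i][j]) for j in range(n)]
            for i in range(n)]


def potts_energy(J, sigma):
    """Energy of eq. (1): minus the sum of J_ij over pairs i < j sharing a community."""
    n = len(sigma)
    return -sum(J[i][j]
                for i in range(n)
                for j in range(i + 1, n)
                if sigma[i] == sigma[j])


# Example (assumed toy graph): two triangles with unit ferromagnetic weights
# inside each triangle and unit penalties on all missing edges.
n = 6
A = [[1 if i != j and i // 3 == j // 3 else 0 for j in range(n)] for i in range(n)]
ones = [[1] * n for _ in range(n)]
J = couplings(A, ones, ones)

split = potts_energy(J, [1, 1, 1, 2, 2, 2])   # each triangle is its own community
merged = potts_energy(J, [1, 1, 1, 1, 1, 1])  # everything in one community
```

In this toy case the split assignment has the lower energy because the merged assignment pays the antiferromagnetic penalty on every missing inter-triangle pair.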

The antiferromagnetic weights $u_{ij}$ provide a "penalty 
function" enabling a non-trivial CD ground state for an 
arbitrary graph. Some models incorporate a weight factor, 
generally on the $u_{ij}$ term, which allows the model to 
span different network scales in qualitatively similar ways. 
The APM penalizes "neutral" relationships (i.e., generally 
= — 1 for J^ < 0) . Another model 134] incorporates 
weighted antiferromagnetic interactions into $w_{ij}$ and applies 
a separate penalty term. Algorithms for modularity 
and the CMPM are effectively implemented with dynamic 
edge weights on the $u_{ij}$ term, but the fluctuations would 
generally be small when approaching the ground state. Local 
CD models for energy calculations were suggested in 
[40],[41], further advocated in [15], and explored in more 
detail in [34]. 

Dimensional reduction bound. — We first provide rigorous bounds on the disorder transition 
for community structure using dimensional reduction inequalities [44],[45]. In the current 
context, these simple, yet exact, inequalities relate a system in any dimension to a local 
($D = 0$ dimensional) system composed of a single vertex (or a finite collection of vertices) 
and its (their) neighbors. The derived bound has a form similar to that suggested by 
mean-field considerations. 

In the thermodynamic limit, a bona fide transition may occur that marks symmetry breaking 
wherein, for infinitesimally weak applied fields that favor a particular state, the 
probability that a given spin belongs to one of the $q$ communities differs from $1/q$. From 
a practical standpoint, we are interested in the probability that a particular spin 
$\sigma_0$ takes on the specific "correct" spin value $\bar\sigma$ that it has in a low 
energy configuration, effectively searching for a "planted" solution $\bar\sigma$. 

We derive upper bounds on the temperature for which the spin $\sigma_0$ attains its correct 
value with high confidence. Towards this end, we first detail general inequalities and then 
turn to their application in our case. We consider a partition of all spins into those of a 
local set $\eta$ (i.e., $\sigma_0$ in the single-spin case) and all other remaining spins 
$\psi$ in the system. The trace over all spins becomes 
$\mathrm{Tr}_{\{\sigma\}} = \mathrm{Tr}_{\{\psi\}}\mathrm{Tr}_{\{\eta\}}$, and the Hamiltonian 
(with or without any weak applied fields) becomes a function involving both sets of spins 
$H(\{\sigma\}) = H(\{\psi\},\{\eta\})$. Any thermal average $\langle f(\eta)\rangle$ can be 
written as 

$$\langle f\rangle = \frac{\mathrm{Tr}_{\psi}\left[z_\psi\,\dfrac{\mathrm{Tr}_{\eta}\,f(\eta)\,e^{-\beta H(\psi,\eta)}}{z_\psi}\right]}{\mathrm{Tr}_{\psi}\,z_\psi}, \tag{3}$$

where we inserted $z_\psi \equiv \mathrm{Tr}_{\eta}\,e^{-\beta H(\psi,\eta)}$ twice in the 
numerator, which is valid for any Hamiltonian. As can be readily seen, the exact 
$\langle f\rangle$ over the large system can be written as a weighted sum (with positive 
normalized weights, $p_\psi = z_\psi/\mathrm{Tr}_{\psi'} z_{\psi'}$, that sum to unity) of 
local averages 

$$\langle f\rangle_\psi = \frac{\mathrm{Tr}_{\eta}\,f(\eta)\,e^{-\beta H(\psi,\eta)}}{z_\psi}. \tag{4}$$

Thus, $\langle f\rangle \le \langle f\rangle_{\bar\psi}$, where $\bar\psi$ is a particular 
set of the spins $\psi$ that maximizes $\langle f\rangle_\psi$. When we substitute 
$\psi = \bar\psi$ in $H$, we obtain a local Hamiltonian $H(\bar\psi,\eta)$ in the spins $\eta$. 
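
The weighted-average structure of eqs. (3) and (4) can be checked by brute force on a very small system. The sketch below is only an illustration under our own assumptions: the local set is the single spin at node 0, the observable is the indicator that this spin equals a chosen target state, communities are labeled from 0 to $q-1$, and the helper names and the example couplings are hypothetical.

```python
import math
from itertools import product


def energy(J, s):
    """Potts energy: minus the sum of J_ij over edges whose endpoints agree."""
    return -sum(Jij for (i, j), Jij in J.items() if s[i] == s[j])


def global_and_local_averages(J, n, q, beta, target=0):
    """Return the exact average of f = delta(sigma_0, target) and all local averages.

    eta is the single spin at node 0; psi is the tuple of the other n - 1 spins.
    local[psi] is the average of f with psi held fixed, as in eq. (4).
    """
    z_total = 0.0
    f_total = 0.0
    local = {}
    for psi in product(range(q), repeat=n - 1):
        boltz = [math.exp(-beta * energy(J, (s0,) + psi)) for s0 in range(q)]
        z_psi = sum(boltz)                    # z_psi = Tr_eta exp(-beta H)
        local[psi] = boltz[target] / z_psi    # local average for this psi
        z_total += z_psi
        f_total += boltz[target]
    return f_total / z_total, local


# Example couplings on 4 nodes (assumed values, mixed signs).
J_example = {(0, 1): 1.0, (0, 2): -0.5, (1, 2): 0.8, (2, 3): 1.2, (0, 3): 0.3}
avg, local = global_and_local_averages(J_example, n=4, q=3, beta=1.0)
best_local = max(local.values())
```

Without an applied field the exact average equals $1/q$ by symmetry, while the largest local average is attained for an external configuration that favors the target state, so the inequality is far from tight in this small example.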

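If we set $f = \delta(\sigma_0,\bar\sigma)$, then the mean value $\langle f\rangle$ 
corresponds to the probability $\langle f\rangle = P(\sigma_0 = \bar\sigma)$ that the spin 
$\sigma_0$ takes its correct value $\bar\sigma$. When computing the internal trace over 
$\eta$, we evaluate $\langle f\rangle_{\bar\psi}$ in the case of a single spin at the origin 
$\sigma_0$ averaged over its $q$ possible states. Applying 
$\langle f\rangle \le \langle f\rangle_{\bar\psi}$ to 
$\langle\delta(\sigma_0,\bar\sigma)\rangle$, we obtain a generous upper bound. If the 
interaction between $\sigma_j$ at site $j$ and $\sigma_0$ at the origin is positive (i.e., 
$J_{j0} > 0$), then $\bar\psi_j = \bar\sigma$. If $J_{j0} < 0$, then we set 
$\bar\psi_j \ne \bar\sigma$. With this set of $\bar\psi$ values, 

$$p = \langle f\rangle \le \frac{e^{\beta J_0}}{e^{\beta J_0} + (q-1)}, \tag{5}$$

where $J_0 = \frac{1}{2}\sum_j J_{j0}\left[1 + \mathrm{sgn}(J_{j0})\right]$. 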

On practical benchmarks, we are interested in cases where $p$ exceeds some threshold value 
$p^*$. The inverse temperature $\beta^*$ at which the bound on the probability reaches $p^*$ is 

$$\beta^* = \frac{1}{J_0}\ln\left[\frac{p^*(q-1)}{1-p^*}\right]. \tag{6}$$

At high $q$ with $p^* = 1/2$, this leads to a rigorous upper bound (UB) for the associated 
crossover temperature 

$$T_\times^{UB} = \frac{J_0}{k_B\ln q}. \tag{7}$$

For $T > T_\times^{UB}$, the correct assignment can only be determined with a probability 
below $p^* = 1/2$. If the exchange constants $J_{j0}$ and the coordination number of 
$\sigma_0$ are finite and do not match or exceed the $\ln q$ dependence in the denominator, 
the system is unsolvable at any positive temperature as $q \to \infty$. The bound of eq. (7) 
for the node at the origin is local, so it may change for other nodes. 
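
To make the scaling of eqs. (5)-(7) concrete, the short Python sketch below evaluates the bound numerically. The function names are hypothetical, $k_B$ defaults to 1 as an assumption about units, and the right-hand side of eq. (5) is written in an algebraically equivalent form that avoids overflow at low temperature.

```python
import math


def ferro_sum(couplings_to_node):
    """J_0 of eq. (5): the sum of the positive couplings J_j0 attached to the node."""
    return sum(J for J in couplings_to_node if J > 0)


def p_upper_bound(beta, J0, q):
    """Right-hand side of eq. (5), rewritten as 1 / (1 + (q - 1) exp(-beta J0))."""
    return 1.0 / (1.0 + (q - 1) * math.exp(-beta * J0))


def beta_star(p_star, J0, q):
    """Inverse temperature of eq. (6) at which the bound equals p_star."""
    return math.log(p_star * (q - 1) / (1.0 - p_star)) / J0


def t_upper_bound(J0, q, k_B=1.0):
    """Large-q crossover bound of eq. (7) for p_star = 1/2."""
    return J0 / (k_B * math.log(q))


# Example (assumed values): a node with two friendly and one adversarial neighbor.
J0 = ferro_sum([1.0, -2.0, 0.5])
for q in (10, 10**3, 10**6):
    print(q, t_upper_bound(J0, q))
```

The printed temperatures fall only logarithmically with $q$, which is the slow decay discussed in the text.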

In practice, some parts of the network can exhibit structure at higher temperatures which 
serves as a bottleneck for global ordering. Generally, the bounds of 
$\langle f\rangle \le \langle f\rangle_{\bar\psi}$ enable a reduction of the full physical 
system to a related problem that occupies a reduced $D$-dimensional sub-volume of the entire 
system. If we define the external state $\psi$ as a set of spins with the average spin value, 
then the resulting average becomes a mean-field average. 

We next discuss a general representation for the Potts model. With it and related approaches, 
we estimate the form of the crossover (or transition) temperature $T_\times$ from a viable 
low temperature ordered phase to a high temperature disordered regime. The scaling in these 
results is similar to that of the rigorous bound in eq. (7). 

Multivariate Tutte polynomial estimate. The multivariate Tutte polynomial [46] is defined as 
a subgraph expansion over $A \subseteq \mathcal{E}$ of a graph $G = (V,\mathcal{E})$ where 
$V$ and $\mathcal{E}$ are the sets of vertices and (ferromagnetic and antiferromagnetic) 
edges, respectively, 

$$Z(G;q,v) = \sum_{A\subseteq\mathcal{E}} q^{k(A)}\prod_{e'\in A} v_{e'}, \tag{8}$$

where $k(A)$ is the number of connected components of $G_A = (V,A)$, 
$v_e = \exp(\beta J_e) - 1$, and $J_e$ is the interaction strength of edge $e$. In CD, large 
$q$ necessarily implies a large number of nodes $|V| = N$. 
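
The subgraph expansion of eq. (8) can be compared against a direct sum over spin configurations on a small graph. The following Python sketch is a brute-force check under our own assumptions (hypothetical function names, communities labeled from 0 to $q-1$, an arbitrary four-node example with mixed-sign couplings); its cost grows exponentially, so it is meant only for tiny graphs.

```python
import math
from itertools import combinations, product


def n_components(n, edges):
    """Number of connected components of the graph (V, A) via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})


def z_tutte(n, edges, J, q, beta):
    """Eq. (8): sum over all edge subsets A of q^k(A) times the product of v_e."""
    v = [math.exp(beta * Je) - 1.0 for Je in J]
    total = 0.0
    for r in range(len(edges) + 1):
        for A in combinations(range(len(edges)), r):
            term = float(q) ** n_components(n, [edges[e] for e in A])
            for e in A:
                term *= v[e]
            total += term
    return total


def z_spins(n, edges, J, q, beta):
    """Direct partition function: sum of Boltzmann weights over all q^n assignments."""
    total = 0.0
    for s in product(range(q), repeat=n):
        e_tot = -sum(Je for (i, j), Je in zip(edges, J) if s[i] == s[j])
        total += math.exp(-beta * e_tot)
    return total


edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
J = [1.0, 0.5, -0.7, 1.2]
zt = z_tutte(4, edges, J, q=3, beta=0.8)
zs = z_spins(4, edges, J, q=3, beta=0.8)
```

The two values agree to floating-point accuracy, including for the antiferromagnetic edge where $v_e < 0$.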

For two disjoint partitions $A$ and $B$ with $G = A \cup B$, 
$Z(G;q,v) = Z(A;q,v_A)\,Z(B;q,v_B)$ where $v_A$ and $v_B$ are 
the edge weights in the respective subgraphs. For unweighted 
systems, the interaction strength is $J_e = J = \pm 1$ 
where $+$ and $-$ correspond to ferromagnetic or antiferromagnetic 
interactions, respectively. 

There is a slight terminology distinction between CD and the energy contributions in eq. (1). 
Edges with $J_e > 0$ correspond to the $w_{ij}$ ferromagnetic ("friendly" or "cooperative") 
interactions in eq. (2), and $J_e < 0$ relates to the $u_{ij}$ antiferromagnetic (neutral and 
"adversarial" in some models) or absent (neutral in some models) interactions. The edge 
effect is conceptually consistent with CD for $J_e > 0$, but antiferromagnetic weights are 
also related by an edge when calculating eq. (8). That is, an interaction exists, but it is 
antiferromagnetic in nature. In CD, repulsive antiferromagnetic interactions correspond to 
adversarial relationships which act like neutral (unconnected) relations that hinder 
community structure. 

For large $T$, we require $T \gg \max_{e\in\mathcal{E}}|J_e|$. The leading order terms for an 
arbitrary graph are due to $A_\emptyset = \{\emptyset\}$ and $A_e = \{e\}$ for each edge 
$e \in \mathcal{E}$. We also include the last $A = \mathcal{E}$ term of $G$ which is addressed 
later in the text, 

$$Z(G;q,v) = q^{N} + q^{N-1}\sum_{e'=1}^{|\mathcal{E}|} v_{e'} + \cdots + q^{k(G)}\prod_{e'=1}^{|\mathcal{E}|} v_{e'}, \tag{9}$$

which applies to arbitrarily large systems of size $N$. The free energy per site 
$f = -(k_B T/N)\ln Z$ becomes 

$$f \approx -k_B T\ln q - \frac{k_B T}{N}\sum_{e'=1}^{|\mathcal{E}|}\frac{v_{e'}}{q}, \tag{10}$$

where we invoke the small $x$ approximation, $\ln(1+x) \approx x$, and neglect the last 
$A = \mathcal{E}$ term in eq. (9). For large $T$, $v_e \approx J_e/(k_B T)$ and 

$$f \approx -k_B T\ln q - \frac{1}{N}\sum_{e'=1}^{|\mathcal{E}|}\frac{J_{e'}}{q}. \tag{11}$$

When compared to eq. (15) for large $q$, it implies that $q$ emulates $T$ when $T$ is large. 
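
The quality of the approximations in eqs. (10) and (11) can be gauged on a small graph where the partition function is obtained by exhaustive enumeration. The sketch below assumes $k_B = 1$ by default and uses hypothetical function names; the ring of four nodes and the parameter values are illustrative choices only.

```python
import math
from itertools import product


def exact_free_energy(n, edges, J, q, T, k_B=1.0):
    """Exact free energy per site, f = -(k_B T / n) ln Z, by summing all q^n states."""
    beta = 1.0 / (k_B * T)
    Z = 0.0
    for s in product(range(q), repeat=n):
        e_tot = -sum(Je for (i, j), Je in zip(edges, J) if s[i] == s[j])
        Z += math.exp(-beta * e_tot)
    return -k_B * T * math.log(Z) / n


def f_eq10(n, J, q, T, k_B=1.0):
    """Eq. (10): first-order expansion with v_e = exp(beta J_e) - 1."""
    beta = 1.0 / (k_B * T)
    v_sum = sum(math.exp(beta * Je) - 1.0 for Je in J)
    return -k_B * T * math.log(q) - (k_B * T / n) * v_sum / q


def f_eq11(n, J, q, T, k_B=1.0):
    """Eq. (11): same expansion with the extra high-T step v_e ~ J_e / (k_B T)."""
    return -k_B * T * math.log(q) - sum(J) / (n * q)


# Ring of four nodes with unit ferromagnetic couplings, at high temperature.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
J_ring = [1.0] * 4
f_exact = exact_free_energy(4, ring, J_ring, q=4, T=50.0)
f10 = f_eq10(4, J_ring, q=4, T=50.0)
f11 = f_eq11(4, J_ring, q=4, T=50.0)
```

In this regime all three values are dominated by the entropic term $-k_B T\ln q$, and the corrections from the edges differ only at higher order in $v_e/q$.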

When $q \gg \max_{e\in\mathcal{E}}|v_e|$, the leading order terms in $1/q$ are still the 
first two in eq. (9). As $T \to 0$, higher order terms in $v_e$ become increasingly important, 
and the expansion is dominated by the last subgraph term for $A = \mathcal{E}$. The three 
displayed terms are "universal" since they apply equally well to all graphs (e.g., lattices, 
Cayley trees, random graphs, etc.). That is, they only depend on the system size $N$, the 
number of links $L$, and the number of connected components $k(G)$ for the full graph $G$. 

All displayed terms in eq. (9) are identical for regular lattices and 
similar-coordination-number Bethe lattices. Similar results are obtained for other graphs 
where Bethe lattice approximants are only identical for these terms. The important 
non-universal terms in the subgraph expansion [denoted by ellipsis] depend on the particular 
graph structure. For the leading and last terms, respectively, the logarithms of the terms 
flesh out the typical scales of the high temperature entropic and low temperature energetic 
contributions to the free energy. 

In the large-$q$ limit, the zeroes of $Z$ for constant $v$ provide the relevant transition 
temperatures in the $N \to \infty$ limit, and the free energy per site becomes non-analytic. 
We can estimate disorder transition temperatures by comparing the second and last terms, 
$q^{N-1}\sum_{e'} v_{e'} \sim q^{k(G)}\prod_{e'=1}^{|\mathcal{E}|} v_{e'}$, assuming a 
"typical" interaction strength $|J_e|$ so that we can solve the equation. If $q$ is large 
then the latter term in eq. (9) will compete with the last term suggesting a crossover 
temperature 

$$T_\times \approx \frac{|J_e|}{k_B\ln\left\{q^{[N-k(G)]/L} + 1\right\}} \tag{12}$$

under the assumptions $N \gg 1$ and $L \gg 1$. For general $\{J_e\}$, we may see multiple 
transitions spread over a range of $T$. In the limit as $T \to 0$, eq. (12) becomes 

$$T_\times \approx \frac{L|J_e|}{k_B[N-k(G)]\ln q}. \tag{13}$$

If we instead compare $v_e/q$ to 1, the factor $L/N$ disappears, but the logarithmic behavior 
in $q$ remains. Eq. (13) diverges for an arbitrarily large complete graph [$L = N(N-1)/2$ and 
$N \to \infty$], and it approaches zero as $q \to \infty$. Often in CD, the graph is (almost) 
completely connected (in a topological sense) so $N \gg k(G)$. For sparse graphs, 
$L \propto N$, so 

$$T_\times^{Sparse} \approx \frac{d|J_e|}{2k_B\ln q}, \tag{14}$$

where $d$ is an average node degree. 

Above $T_\times$, the large-$q$ contributions dominate, and the system is in a disordered 
state, but it is globally ordered for $T \ll T_\times$. For moderate levels of noise in a 
graph, larger $d$ (with a well-defined community structure) actually increases the crossover 
temperature. This indicates that additional noise up to an insurmountable threshold allows 
the system to explore the phase space more completely when the community structure is solved 
[43]. For degree distributions seen in CD (e.g., often a power law), the corresponding 
crossover temperature would spread or split into multiple values which model the distinct 
features of the graph. In the limit of large $N$, the crossover(s) would become an 
approximate transition point. 

In order to highlight the similarity between the large-$q$ and $T$ behaviors, we fix 
$T = T' > T_\times$, define the constant 
$J_e^{(q)} = k_B T'\{\exp[J_e/(k_B T')] - 1\}$, and rewrite eq. (10) as 

$$f \approx -k_B T'\ln q - \frac{1}{N}\sum_{e'=1}^{|\mathcal{E}|}\frac{J_{e'}^{(q)}}{q}. \tag{15}$$

Large $q$ in eq. (15) emulates large $T$ in eq. (11). $J_e^{(q)}$ is exponentially weighted 
in $\beta'$, so a non-zero (perhaps small) region of stability is ensured except in the 
presence of high noise [15],[16],[43]. Transitions between contending minima in random 
embedded systems [16], contending states with multi-scale (e.g., hierarchical) structures 
[10], or others may occur over a range of temperatures. 

Mean-field and free energy estimates. The mean-field transition temperature for lattices 
with a fixed coordination number $d$, constant exchange $J$, and arbitrary $q$ is [47],[48] 

$$T_\times^{MF} = \frac{Jd(q-2)}{2k_B(q-1)\ln(q-1)} \tag{16}$$

for $q \ge 3$. This equation yields a large-$q$ limit of 

$$T_\times^{MF} \approx \frac{dJ}{2k_B\ln q} \tag{17}$$

in agreement with eq. (14). The $q \to \infty$ limit on ferromagnetic lattices 
asymptotically approaches the mean-field theory result with translationally invariant $J$ and 
constant $d$ [47],[49]. The Gibbs-Bogoliubov-Feynman inequality also allows a method for 
deriving optimal mean-field approximations in general. 

We can ascertain the same asymptotic behavior in $q$ by analyzing the free energy per site if 
we flip a spin in a ground state. Assuming that the energy and entropy changes are 
uncorrelated, the energy change for the node flip is $\Delta U \sim d|J_e|$ up to an 
undetermined constant factor, and the entropy change is $\Delta S \sim \ln q$, yielding a 
free energy change $\Delta F \sim d|J_e| - k_B T\ln q$. The entropy contribution dominates 
$\Delta F$ (see [26] on general lattices) above a crossover estimate 

$$T_\times^{FE} \approx \frac{d|J_e|}{k_B\ln q}, \tag{18}$$

which agrees well with eq. (14). As we alluded earlier, the logarithms of the leading and 
last terms in eq. (9) trivially provide the typical entropic and energetic contributions to 
the free energy at high and low temperatures. 

Fig. 2: (Color online) The figure depicts $q$ independent maximal sub-graphs of size $n_i$ 
for $i = 1$ to $q$. The line thickness illustrates the relative strengths. With strictly 
ferromagnetic weights as depicted, this system is the strongest possible community structure 
since it is "perfectly" defined via a maximal number of internal edges with no external edges 
to muddle the natural structure. Any antiferromagnetic weights would weaken the community 
structure much like "neutral" relations act to break up a well-defined community. Through 
eq. (18), we show that all large-$q$ CD problems that are representable as a general weighted 
Potts model experience global disorder at high $q$ above a crossover temperature $T_\times$, 
and we calculate the partition function and the free energy per site of this system as a 
specific example in eqs. (19)-(21). 
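
The crossover estimates of eqs. (12)-(14) and (16)-(18) are simple closed forms, so their mutual consistency at large $q$ is easy to check numerically. The Python sketch below is only an illustration: the function names are hypothetical, $k_B$ defaults to 1 as an assumed choice of units, and the example graph parameters are arbitrary.

```python
import math


def t_cross_eq12(J, L, N, k, q, k_B=1.0):
    """Eq. (12): crossover from balancing the second and last terms of eq. (9)."""
    return J / (k_B * math.log(q ** ((N - k) / L) + 1.0))


def t_cross_eq13(J, L, N, k, q, k_B=1.0):
    """Eq. (13): large-q form of eq. (12)."""
    return L * J / (k_B * (N - k) * math.log(q))


def t_sparse_eq14(J, d, q, k_B=1.0):
    """Eq. (14): sparse-graph form with average degree d."""
    return d * J / (2.0 * k_B * math.log(q))


def t_mf_eq16(J, d, q, k_B=1.0):
    """Eq. (16): mean-field transition temperature for q >= 3."""
    return J * d * (q - 2) / (2.0 * k_B * (q - 1) * math.log(q - 1))


def t_mf_eq17(J, d, q, k_B=1.0):
    """Eq. (17): large-q limit of eq. (16)."""
    return d * J / (2.0 * k_B * math.log(q))


def t_fe_eq18(J, d, q, k_B=1.0):
    """Eq. (18): single spin-flip free energy estimate."""
    return d * J / (k_B * math.log(q))


# Example (assumed values): a connected sparse graph with average degree d = 2L/N = 6.
N, L, k, q, J = 1000, 3000, 1, 10**6, 1.0
d = 2.0 * L / N
estimates = {
    "eq12": t_cross_eq12(J, L, N, k, q),
    "eq13": t_cross_eq13(J, L, N, k, q),
    "eq14": t_sparse_eq14(J, d, q),
    "eq16": t_mf_eq16(J, d, q),
    "eq17": t_mf_eq17(J, d, q),
    "eq18": t_fe_eq18(J, d, q),
}
```

For these parameters the first five estimates agree to within about a percent, while the spin-flip estimate of eq. (18) differs by the constant factor of two noted in the text.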

Non-interacting cliques example. The most strongly defined community structure is a system 
of $q$ non-interacting cliques (maximally connected sub-graphs with no intercommunity 
relationships that obscure community structure). We define $q$ weighted cliques with sizes 
$n_i$ for $i = 1$ to $q$ as depicted in fig. 2. The partition function at high $T$ or high 
$q$ with $T > T_\times$ is [43] 

$$Z \approx \prod_{i=1}^{q}\prod_{j=1}^{n_i} q\left(1 + \sum_{k=1}^{j-1}\frac{v_{l_j+k}}{q}\right), \tag{19}$$

where $l_j = (j-1)(j-2)/2$. Eq. (19) is equivalent to eq. (9) to first order in $v_k$. When 
$T \gg T_\times$, high $q$ results in the same approximation as high $T$, affirming the 
implication made with eqs. (11) and (15). 

For high $T$ specifically, we make the additional approximation $v_k \approx \beta J_k$, and 
the free energy per site becomes 

$$f \approx -k_B T\ln q - \frac{E_N}{q}, \tag{20}$$

using the same approximations as in eq. (10). $E_N = \sum_{e'} J_{e'}/N$ is the energy per 
site. In ref. [43], we derived 

$$T_\times^{NIC} \approx \frac{(n-1)J}{2k_B\ln q} \tag{21}$$

for constant $J$ and $q$ non-interacting cliques of fixed size $n$. With $d = n - 1$ edges 
per node, the result coincides well with eqs. (14), (17) and (18). 
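
The statement that eq. (19) reduces to the leading terms of eq. (9) at first order in the edge weights can be checked numerically. The sketch below rests on our own reading of the product form: edges within each clique are ordered so that entries $l_j + 1$ through $l_j + j - 1$ connect node $j$ to the earlier nodes, each clique carries its own list of weights, and the function names are hypothetical.

```python
def z_cliques_eq19(sizes, v, q):
    """Product form of eq. (19); v[i] lists the n_i (n_i - 1) / 2 edge weights of clique i."""
    z = 1.0
    for n_i, v_i in zip(sizes, v):
        for j in range(1, n_i + 1):
            l_j = (j - 1) * (j - 2) // 2
            # q * (1 + sum_k v / q) written as q + sum_k v; the sum is empty for j = 1
            z *= q + sum(v_i[l_j + k - 1] for k in range(1, j))
    return z


def z_first_order_eq9(sizes, v, q):
    """First two terms of eq. (9): q^N + q^(N-1) * (sum of all edge weights)."""
    N = sum(sizes)
    return q ** N + q ** (N - 1) * sum(sum(v_i) for v_i in v)


def gap(eps, q=50):
    """Difference between the two expressions for two triangles with weights eps."""
    v = [[eps] * 3, [eps] * 3]
    return z_cliques_eq19([3, 3], v, q) - z_first_order_eq9([3, 3], v, q)
```

Halving the edge weight reduces the difference by roughly a factor of four, as expected when the two expressions agree through first order and first differ at second order.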



Thermal annealing comments. For heat bath or simulated annealing (SA) algorithms in CD, when 
$T \ll L|J_e|/[N k_B\ln q]$, the global system remains in an ordered state. For higher $T$, 
the global system becomes increasingly disordered in terms of partition function state 
probabilities. If we consider states "near" equilibrium, small fluctuations in the energy 
result in only tiny changes to the state probabilities through the Boltzmann weight. Another 
perspective is that even for a system of non-interacting cliques depicted in fig. 2, larger 
$q$ creates a greater probability that a non-negligible fraction of cliques will be 
disconnected at a given $T$ at any given point during the stochastic solution. 

Most SA algorithms (e.g., [4]) utilize energy differences 
to evaluate dynamic changes to the community division, 
so the system is effectively solved locally (algorithmically 
speaking, global parameters in the cost function are a sep- 
arate issue [4] [l^[3T|[34l[40||42||50| ) . In practice, SA is lim- 
ited to systems with O(10 4 ) nodes without significant par- 
allelization, so greedy algorithms (T = 0) are used on the 
largest systems 5 p5p5] . SA implements a cooling scheme 
to fine tune solutions determined by the high $T$ optimization, 
and our results indicate that cooling becomes more 
important for large-$q$ systems. 
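
The local character of such updates can be made explicit with a single heat-bath move. The following Python sketch is a generic illustration and not the specific algorithm of any cited reference: the function name and data layout are hypothetical, communities are labeled from 0 to $q-1$, and only the couplings attached to the selected node enter the update.

```python
import math
import random


def heat_bath_update(i, sigma, neighbors, q, beta, rng):
    """Resample spin i from its local Boltzmann distribution and return the new state.

    neighbors[i] is a list of (j, J_ij) pairs. The energy of state s for spin i is
    minus the summed coupling to neighbors currently in state s.
    """
    field = [0.0] * q
    for j, Jij in neighbors[i]:
        field[sigma[j]] += Jij
    top = max(field)                       # subtract the maximum for numerical stability
    weights = [math.exp(beta * (h - top)) for h in field]
    r = rng.random() * sum(weights)
    acc = 0.0
    for s, w in enumerate(weights):
        acc += w
        if r < acc:
            sigma[i] = s
            return s
    sigma[i] = q - 1                       # guard against floating-point round-off
    return q - 1


# Example: node 0 has three friendly neighbors that all sit in community 2.
rng = random.Random(0)
sigma = [0, 2, 2, 2]
neighbors = {0: [(1, 1.0), (2, 1.0), (3, 1.0)]}
cold = heat_bath_update(0, sigma, neighbors, q=5, beta=50.0, rng=rng)
```

At low temperature the node joins its neighbors essentially with certainty, whereas at $\beta = 0$ every one of the $q$ states is equally likely, which is the single-node picture behind the competition between $J_0$ and $\ln q$.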

Conclusion. — We showed a global disorder transi- 
tion at a large number of communities q for a general 
weighted (or unweighted) Potts model over essentially ar- 
bitrary graphs. The community structure of a complex 
network may be globally disordered at large q but still 
be locally ordered and locally solvable. Our results en- 
compass many popular cost functions utilized for commu- 
nity detection, including modularity and common Potts 
model variants. We demonstrated this effect using strin- 
gent exact bounds as well as related results suggested 
by mean-field and other general approaches. With these 
bounds, results for a local system that occupies only a sub- 
volume of the original system lead to rigorous results for 
the full system, and they may have similar applications in 
the analysis of other hard computational problems where 
mean-field approaches are commonly applied. We also illustrated 
that in the strongest possible model partition, 
that of non-interacting cliques, the large-$q$ limit induces 
disorder akin to random thermal effects. 

Increasing $q$ emulates increasing $T$ in arbitrary graphs for any CD method that may be 
represented as a general weighted Potts model. The asymptotic behavior of the global disorder 
transition varies slowly in $q$, $T_\times \approx L|J_e|/[N k_B\ln q]$, meaning that 
problems of practical size maintain a finite region of solvability given a stochastic heat 
bath algorithm. Local algorithm dynamics (even for models which incorporate global weighting 
parameters) serve to circumvent the global disorder transition. This global disorder is 
generally circumvented by the often used SA algorithm, but "glassy" problems with high noise 
(many extraneous intercommunity edges) would remain a challenge for any algorithm or model. 